Multimodal Emotion Recognition


Multimodal emotion recognition is the task of identifying a person's emotional state by combining cues from multiple modalities, such as speech, text, and facial expressions.
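A common baseline for combining modalities is late fusion: each modality-specific model outputs a probability distribution over emotion classes, and the distributions are averaged before taking the argmax. The sketch below is illustrative only — the class set, the stubbed per-modality models, and the fixed probabilities are assumptions, not drawn from any paper listed here.

```python
# Minimal late-fusion sketch for multimodal emotion recognition.
# The three modality models are stubs returning fixed distributions;
# a real system would run acoustic, text, and facial classifiers.

EMOTIONS = ["happy", "sad", "angry", "neutral"]

def speech_model(clip):
    # Stub for an acoustic emotion classifier.
    return [0.10, 0.20, 0.60, 0.10]

def text_model(transcript):
    # Stub for a text emotion classifier.
    return [0.05, 0.15, 0.70, 0.10]

def face_model(frames):
    # Stub for a facial-expression classifier.
    return [0.20, 0.10, 0.50, 0.20]

def late_fusion(distributions, weights=None):
    """Weighted average of per-modality class distributions."""
    n = len(distributions)
    weights = weights or [1.0 / n] * n
    fused = [0.0] * len(distributions[0])
    for w, dist in zip(weights, distributions):
        for i, p in enumerate(dist):
            fused[i] += w * p
    return fused

def predict(clip, transcript, frames):
    fused = late_fusion([speech_model(clip),
                         text_model(transcript),
                         face_model(frames)])
    return EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]

print(predict(None, None, None))  # all three stubs agree on "angry"
```

Per-modality weights in `late_fusion` let a system downweight an unreliable modality (e.g., a noisy audio channel), which is one simple way baseline systems handle modality imbalance.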

A Baseline Multimodal Approach to Emotion Recognition in Conversations

Jan 31, 2026

AmbER$^2$: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text

Jan 25, 2026

Emotion-LLaMAv2 and MMEVerse: A New Framework and Benchmark for Multimodal Emotion Understanding

Jan 23, 2026

STARS: Shared-specific Translation and Alignment for missing-modality Remote Sensing Semantic Segmentation

Jan 24, 2026

Not all Blends are Equal: The BLEMORE Dataset of Blended Emotion Expressions with Relative Salience Annotations

Jan 19, 2026

Scaling Ambiguity: Augmenting Human Annotation in Speech Emotion Recognition with Audio-Language Models

Jan 21, 2026

A Unified Framework for Emotion Recognition and Sentiment Analysis via Expert-Guided Multimodal Fusion with Large Language Models

Jan 12, 2026

Can Vision-Language Models Understand Construction Workers? An Exploratory Study

Jan 15, 2026

MCGA: A Multi-task Classical Chinese Literary Genre Audio Corpus

Jan 14, 2026

E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis

Jan 11, 2026